Clustering XML documents by patterns
Similar resources
Clustering XML Documents by Structure
While the processing and management of XML data are popular research issues, operations based on the structure of XML data have not yet received strong attention. These operations involve, among others, the grouping of structurally similar XML documents. Such grouping refers to the application of clustering methods using distances that estimate the similarity between tree structures. This paper...
XCleaner: A New Method for Clustering XML Documents by Structure
Abstract: With the vastly growing data resources on the Internet, XML is one of the most important standards for document management. Not only does it provide enhancements to document exchange and storage, but it is also helpful in a variety of information retrieval tasks. Document clustering is one of the most interesting research areas that utilize XML’s semi-structural nature. In this paper,...
A methodology for clustering XML documents by structure
The processing and management of XML data are popular research issues. However, operations based on the structure of XML data have not received strong attention. These operations involve, among others, the grouping of structurally similar XML documents. Such grouping results from the application of clustering methods with distances that estimate the similarity between tree structures. This pape...
A Tree-Based Approach to Clustering XML Documents by Structure
We propose a novel methodology for clustering XML documents on the basis of their structural similarities. The basic idea is to equip each cluster with an XML cluster representative, i.e. an XML document subsuming the most typical structural specifics of a set of XML documents. Clustering is essentially accomplished by comparing cluster representatives, and updating the representatives as soon ...
Clustering XML Documents Using Structural Summaries
This work presents a methodology for grouping structurally similar XML documents using clustering algorithms. Modeling XML documents with tree-like structures, we face the ‘clustering XML documents by structure’ problem as a ‘tree clustering’ problem, exploiting distances that estimate the similarity between those trees in terms of the hierarchical relationships of their nodes. We suggest the u...
Journal
Journal title: Knowledge and Information Systems
Year: 2015
ISSN: 0219-1377,0219-3116
DOI: 10.1007/s10115-015-0820-0